

Section: New Results

Development of FDTB1

Participants: Laurence Danlos, Margot Colinet, Jacques Steinlin.

FDTB1 is the first step towards building the French Discourse Tree Bank (FDTB), which adds a discourse layer on top of the syntactic layer available in the French Tree Bank (FTB). In this first step, we identified all the words and phrases in the corpus that are used as “discourse connectives”. The methodology was as follows: first, we highlighted all the items in the corpus that are recorded in LexConn [106], a lexicon of French connectives with 350 items; next, we eliminated some of these items according to the following criteria:

  1. first, we filtered out the LexConn items that are annotated in FTB with parts of speech incompatible with a connective use, e.g. bref annotated as Adj instead of Adv, en fait annotated as Pro V instead of (compound) Adv;

  2. second, since we stipulate, for theoretical and practical reasons, that the elementary arguments of connectives must be clauses or VPs, we filtered out, e.g., LexConn prepositions that introduce NPs;

  3. last, we filtered out LexConn prepositions and adverbials with a non-discursive function.

Unlike the first two, the last criterion requires manual work. For example, the preposition pour (to) is ambiguous between a connective use (Fred s'est dépêché pour être à la gare à 17h (Fred hurried to be at the station at 5 p.m.)) and a preposition introducing a complement (Fred s'est dépêché pour aller à la gare (Fred hurried to go to the station)). The disambiguation between the two uses is subtle and is the topic of a long paper [22], whose results have been used to enhance Lefff [44].
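The three criteria above can be sketched as a small filtering pipeline. This is only an illustrative sketch, not the tooling actually used for FDTB1: the `Candidate` record, the toy `LEXICON` table, the tag names, and the `is_discursive` callback (which stands in for the manual annotation step) are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Toy stand-in for a connective lexicon: form -> POS tags compatible with
# a connective use. The entries and tag names are illustrative assumptions.
LEXICON = {
    "bref": {"Adv"},
    "en fait": {"Adv"},
    "pour": {"Prep"},
}

# Criterion 2: elementary arguments must be clauses or VPs (labels assumed).
CLAUSAL_CATEGORIES = {"S", "VP"}


@dataclass(frozen=True)
class Candidate:
    form: str          # surface form found in the corpus
    pos: str           # part of speech assigned in the treebank
    arg_category: str  # category of the phrase the item introduces


def automatic_filter(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Apply the two criteria that need no human judgement."""
    kept = []
    for cand in candidates:
        compatible = LEXICON.get(cand.form)
        if compatible is None:
            continue  # not a lexicon item at all
        if cand.pos not in compatible:
            continue  # criterion 1: POS incompatible with a connective use
        if cand.arg_category not in CLAUSAL_CATEGORIES:
            continue  # criterion 2: argument is not a clause or VP
        kept.append(cand)
    return kept


def select_connectives(
    candidates: Iterable[Candidate],
    is_discursive: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Run the automatic filters, then criterion 3 via a manual judgement."""
    return [c for c in automatic_filter(candidates) if is_discursive(c)]


if __name__ == "__main__":
    sample = [
        Candidate("bref", "Adj", "S"),    # removed by criterion 1
        Candidate("bref", "Adv", "S"),    # survives the automatic filters
        Candidate("pour", "Prep", "NP"),  # removed by criterion 2
        Candidate("pour", "Prep", "VP"),  # left for manual disambiguation
    ]
    for cand in automatic_filter(sample):
        print(cand)
```

Keeping the manual judgement behind a callback separates the two automatic criteria, which can be rerun cheaply, from the third, which depends on an annotator's decision for each remaining occurrence.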

The FDTB corpus contains 18,535 sentences, and FDTB1 identifies 9,833 discourse connectives. This resource is freely available.